ABSTRACT

We show that the Cepheid PL relation is affected by a Malmquist-type bias,
the so-called population incompleteness bias. Its calculated slope appears shallower
than the true one because of the cutoff in apparent magnitude resulting from the
instrumental limiting magnitude. Furthermore, the use of the PL relation, even with
the correct slope, leads to underestimation of distances which is not negligible. We
confirm this finding by studying simulated PL relations, and we show that this bias
may be as large as 0.2 or 0.3 magnitudes on distance modulus. We also test the
efficiency of a cutoff in log P and show that it is a good way to minimize this bias.
However, a correction of this bias is difficult as long as the completeness of the sample is
not perfectly well established.

Subject headings: Cepheids — distance scale



1. Introduction 

The Cepheid Period-Luminosity relation (hereafter PL relation) is one of the major tools used 
for computing extragalactic distances from data like those provided by the Hubble Space Telescope 
(HST). 

However, Teerikorpi (1984) pointed out long ago that such a linear relationship is
prone to be affected by biases. For extragalactic Cepheids a bias is expected, identical to that
encountered in a cluster of galaxies, the so-called population incompleteness bias (Teerikorpi
1987, Bottinelli et al. 1987).
Sandage (1988) has already noticed that truncating a complete sample of Cepheids in the LMC leads
to too shallow a slope of the PL relation. We intend to show that HST observations, although
much deeper in apparent magnitude, are affected by this kind of bias. This is because a sample
of Cepheids in a given external galaxy (thus, at the same distance) is complete only at the bright
end of the Cepheid luminosity function (i.e. only for large log P values), since observations are
limited to a given apparent limiting magnitude.

After introducing the data and the method used (section 2), we investigate how such a bias affects
the slope, a (section 3), taking NGC 4536 as an example, and thereby highlight the existence
of the bias. In section 4 we analyze the effect on the zero-point, b, using the correct
slope of the PL relation. Section 5 presents a simulation that confirms these results and gives
an estimate of the bias, as well as of its reduction when a cutoff in log P is applied.
It is important to address this question because the effect has been little studied so far
and yet is larger than generally admitted.



2. Data and distance moduli 

We use a sample of 750 Cepheids which have both V and I photometry available in the
literature. These Cepheids are located in 23 galaxies: 14 of them have Cepheids observed with
HST, whereas the other 9 galaxies have only ground-based observations. We checked all light
curves in order to detect overtone pulsators (low amplitude, log P < 1 and symmetrical curves)
and removed them from our sample, since they follow a different PL relation (we did not
try to correct their periods).

In order to compute the distance moduli of these galaxies, we choose the absolute calibration of
the PL relation from Gieren et al. (1998). Their PL calibration is based on the infrared Barnes-Evans
surface brightness technique (note that it is insensitive to both Cepheid metallicity and reddening).
We thus avoid comparing PL relations of different galaxies with different metallicities and
inaccurate reddenings. The PL relations are:

$$M_V = -2.769 \log P - 1.294 \tag{1}$$
$$M_I = -3.041 \log P - 1.726 \tag{2}$$

We can easily compute the V and I apparent distance moduli for each Cepheid:

$$\mu_V = V - M_V \tag{3}$$
$$\mu_I = I - M_I \tag{4}$$

The supposed true distance moduli are then:

$$\mu_0 = \mu_V - R(\mu_V - \mu_I) \tag{5}$$

where

$$R = \frac{A_V}{A_V - A_I} = 2.446 \tag{6}$$

according to Cardelli et al. (1989), $A_V$ and $A_I$ being respectively the V and I extinction
coefficients. The supposed true distance modulus of a galaxy is then assumed to be the mean of
the individual distance moduli of its Cepheids.
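
The reduction just described can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the function names `true_modulus` and `galaxy_modulus`
and the (V, I, log P) tuple layout are hypothetical, and the test Cepheid is invented; only the
coefficients come from equations (1), (2) and (6).

```python
from statistics import mean

# Calibration quoted in the text: M = slope * log P + zero_point.
PL_V = (-2.769, -1.294)   # V band, equation (1)
PL_I = (-3.041, -1.726)   # I band, equation (2)
R = 2.446                 # A_V / (A_V - A_I), equation (6)


def true_modulus(v_mag, i_mag, log_p):
    """Reddening-corrected distance modulus of one Cepheid (equations 3 to 5)."""
    mu_v = v_mag - (PL_V[0] * log_p + PL_V[1])   # apparent V modulus
    mu_i = i_mag - (PL_I[0] * log_p + PL_I[1])   # apparent I modulus
    return mu_v - R * (mu_v - mu_i)


def galaxy_modulus(cepheids):
    """Mean of the individual moduli; `cepheids` is an iterable of (V, I, log P)."""
    return mean(true_modulus(v, i, lp) for v, i, lp in cepheids)


# Hypothetical noise-free Cepheid placed at mu = 30.0 with A_V = 0.264 mag.
a_v = 0.264
a_i = a_v * (1.0 - 1.0 / R)          # I extinction implied by the adopted R
log_p = 1.2
v = 30.0 + PL_V[0] * log_p + PL_V[1] + a_v
i = 30.0 + PL_I[0] * log_p + PL_I[1] + a_i
recovered = galaxy_modulus([(v, i, log_p)])
print(f"recovered modulus: {recovered:.3f}")
```

For a noise-free Cepheid the extinction term cancels and the input modulus is recovered, which
is a convenient sanity check before applying the procedure to real photometry.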

3. Effect on the PL relation slope

The V-band slope is very well defined from ground-based observations of LMC Cepheids.
Gieren et al. (1998) found a = -2.769 ± 0.073, Tanvir (1998) found a = -2.774 ± 0.083 and
Madore & Freedman (1991) found a = -2.76 ± 0.11. We will adopt:

$$a = -2.77 \pm 0.09 \tag{7}$$

The slope of the PL relation cannot be determined from distant galaxies. We almost always obtain
apparent slopes (a') significantly shallower than the LMC one (assumed to be the true one). Yet
the slope of the PL relation does not vary with metallicity. This is predicted by models (Chiosi et
al. 1993) and is widely acknowledged, even by those who claim that metallicity effects are strong
(Beaulieu et al. 1997). Therefore, metallicity effects cannot be invoked to explain the shallow slopes.



The low value of a' can actually be traced to the population incompleteness bias resulting
from apparent limiting magnitudes ($V_{lim} \simeq 26.5$ for HST measurements and $V_{lim} \simeq 21.5$ for
ground-based observations). Figures 1 and 2 show respectively the V and I PL relations for one
of the best measured galaxies, NGC 4536, and illustrate how the bias works. For instrumental
reasons, apparent magnitudes are limited to a given limit $V_{lim}$. Thus, the distribution in the plot
of V against log P (or similarly $M_V$ against log P for a given distance modulus) is distorted. If we
force a linear regression, the line will tend to pass through the point having the largest inertia (point A
in figs. 1 and 2). This leads to a shallower slope. Assuming a Gaussian distribution of residuals (as
a first approximation of a top-hat distribution), the model given by Paturel et al. (1997) predicts a
slope a':

$$a' = \frac{a}{1 + [\ldots] / (V_{lim} - \mu - M_{max})} \tag{8}$$

where a is the unbiased slope, $\sigma$ is the scatter of the V band PL relation in magnitudes ($\sigma \simeq 0.3$
mag), $\mu$ is the adopted distance modulus and $M_{max}$ the brightest end of absolute V magnitudes
($M_{max} \simeq -7$). Figure 3 gives the predicted variation of a' with the distance modulus for both
ground-based and HST observations. On the same figure we give the observed slopes obtained by a
direct regression for Cepheids in 23 galaxies (ground-based and HST observations are represented
with different symbols). The agreement fully supports the interpretation of the bias and shows
that PL relations in external galaxies are definitely affected by incompleteness bias. It
must be noted that the slope increases dramatically with distance modulus, preventing us from
using an observed slope without caution.
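
The flattening of the fitted slope can be reproduced with a toy Monte Carlo. The sketch below is
not the model of Paturel et al. (1997); it simply draws Cepheids around the V-band PL relation,
applies a sharp cut at the limiting magnitude (an assumption, since real limits are softer) and
compares direct regressions on the complete and magnitude-limited samples. The period
distribution is the one used in section 5.1; the function names, sample size, seed and the
distance modulus of 31.0 are illustrative choices.

```python
import random

TRUE_SLOPE, ZERO_POINT, SCATTER = -2.769, -1.294, 0.3   # V-band values from the text


def fit_slope(xs, ys):
    """Ordinary least-squares slope of y against x (direct regression)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate(mu, v_lim, n=5000, seed=1):
    """Return (log P, V) pairs for all Cepheids and for those brighter than v_lim."""
    rng = random.Random(seed)
    full, observed = [], []
    for _ in range(n):
        log_p = rng.gauss(1.0, 0.35)
        v = mu + TRUE_SLOPE * log_p + ZERO_POINT + rng.gauss(0.0, SCATTER)
        full.append((log_p, v))
        if v < v_lim:                 # assumed sharp cut in apparent magnitude
            observed.append((log_p, v))
    return full, observed


full, observed = simulate(mu=31.0, v_lim=26.5)
slope_full = fit_slope(*zip(*full))
slope_observed = fit_slope(*zip(*observed))
print(f"complete sample: {slope_full:.2f}, magnitude-limited: {slope_observed:.2f}")
```

The complete sample returns a slope close to the input value, while the truncated sample gives a
noticeably shallower one, in the sense described above.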

The inverse regression line is an approach which should lead to the correct result (Kelson et al. 1996).
Unfortunately, it is subject to the opposite bias (bias in log P) because very long period Cepheids
are difficult to detect. Furthermore, the inverse slope varies with the scatter of the sample,
prohibiting direct comparison with the calibrating PL relation, whose scatter is smaller. This
situation is identical to that encountered for the inverse Tully-Fisher relation (see Teerikorpi 1990).

4. Effect on the zero-point

We would like to stress the most important consequence of the bias. Even with the right
slope, the distance modulus will be underestimated when data are limited
in apparent magnitude. This fact can easily be understood from the distribution of the points
in figures 1 and 2. Because of the cutoff in magnitude there are more points above the mean PL
line for short periods. Of course, since the residuals on $\mu_V$ and $\mu_I$ are correlated, the effect of
incompleteness is reduced in the second part, $(\mu_V - \mu_I)$, of equation (5). However, the
first part, $\mu_V$, of this equation remains affected by incompleteness, and this term is moreover
an order of magnitude larger than the second one ($R(\mu_V - \mu_I)/\mu_V < 5\%$).

In order to highlight this effect, the mean distance modulus is calculated for NGC 4536 while
progressively removing the shortest period Cepheids. Figure 4 shows the variation of the mean
distance modulus with the lower log P cutoff ($\log P_l$). The first point (on the left) represents the
mean distance modulus of NGC 4536 when using its 73 Cepheids vs. the lowest log P of these 73
Cepheids. The next point represents the mean distance modulus vs. the lowest log P of the 72
Cepheids having the longest periods, and so on for 71 Cepheids down to 1 Cepheid. The effect
of the period cutoff on the mean distance modulus can then be read directly from this
figure. At present, the detection of a plateau in the growth curve of the mean distance modulus seems
the most efficient way to get unbiased distance moduli from the Cepheid PL relation. The beginning
of the plateau indicates the cutoff in log P which should be applied ($\log P_l = 1.25$ in this case).
In this example, the biased modulus is 30.50 while the unbiased modulus is 30.76. This
effect is therefore far from negligible. The rule of thumb for calculating unbiased distance
moduli is simply to use only log P values larger than the given limit $\log P_l$. This limit can also be
calculated from figure 1:

$$\log P_l = \frac{V_{lim} - 2\sigma - \mu - b}{a} \tag{9}$$

For NGC 4536, using the PL relation given in equation (1) and the same parameters as in relation
(8), $V_{lim} = 26.5$ and $\sigma = 0.3$, we obtain $\log P_l = 1.19$ for $\mu = 30.50$. From the growth curve of
the distance modulus (fig. 4) it appears that this limit is quite acceptable. The definition of $V_{lim}$ is,
however, both crucial and difficult.
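
Both tools of this section, the growth curve and the calculated period limit, are easy to sketch
in Python. The block below rests on an assumption that should be stated plainly: the limit is
taken to be the period at which the faint 2σ envelope of the V-band PL relation reaches
$V_{lim} - \mu$. This reading reproduces the value of 1.19 quoted above for NGC 4536, but the
function names `period_cutoff` and `growth_curve` are hypothetical and not from the original work.

```python
from statistics import mean


def period_cutoff(v_lim, mu, slope=-2.769, zero_point=-1.294, scatter=0.3):
    """log P at which the faint 2-sigma envelope of the PL relation reaches v_lim.

    Assumed form: slope * log P + zero_point + 2 * scatter = v_lim - mu.
    """
    return (v_lim - mu - zero_point - 2.0 * scatter) / slope


def growth_curve(log_periods, moduli):
    """Mean modulus of the Cepheids with log P at or above each successive cutoff.

    Returns (cutoff, mean modulus, sample size) triples, shortest cutoff first.
    """
    pairs = sorted(zip(log_periods, moduli))
    return [
        (pairs[k][0], mean(m for _, m in pairs[k:]), len(pairs) - k)
        for k in range(len(pairs))
    ]


cutoff = period_cutoff(v_lim=26.5, mu=30.50)
print(f"log P_l = {cutoff:.2f}")   # prints: log P_l = 1.19
```

Plotting the second element of each triple against the first gives the growth curve; the plateau
would then be looked for near the value returned by `period_cutoff`.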



5. Simulation 

5.1. Construction of simulated PL relations 

In order to confirm that our interpretation of the incompleteness bias is the right one, we
construct simulated PL relations in the following way: for each Cepheid of a simulated galaxy, we
first choose a true distance modulus ($\mu_{True}$) and we draw the following values at random from
Gaussian distributions:

• log P ($\langle \log P \rangle = 1.0$, $\sigma = 0.35$)
• $E_{B-V} = (B-V) - (B-V)_0$ ($\langle E_{B-V} \rangle = 0.08$, $\sigma = 0.04$)

Using equations (1) and (2) we compute the V and I absolute magnitudes of the Cepheid¹ and add
to them a random dispersion $\Delta$, the intrinsic dispersion of the PL relations, with $\langle \Delta \rangle = 0$ and $\sigma = 0.3$.
We then compute and add the V and I absorptions according to $A_V = 3.3 E_{B-V}$ and $A_I = 1.95 E_{B-V}$.
We add random measurement errors, $\epsilon$, calculated as a linear function of the distance modulus,
$\epsilon = 0.2 - (32 - \mu_{True})/45$, with no correlation between V and I, and obtain:

$$V = \mu_{True} - 2.769 \log P - 1.294 + \Delta + A_V + \epsilon_V \tag{10}$$

$$I = \mu_{True} - 3.041 \log P - 1.726 + \Delta + A_I + \epsilon_I \tag{11}$$

Finally, we compute the probability for the Cepheid to be detected in both the V and I bands. We
draw a random parameter $t \in [0, 1]$ and compute the quantity :

Whenever t is smaller than this quantity, the Cepheid may be observed by the HST and we keep it in
our simulated galaxy. Otherwise it is rejected. We assume a = 3 and $V_{lim} = 26.5$ for HST measurements.
We also assume that a typical galaxy has 250 Cepheids, so that about 80 may be detected for
$\mu_{True} \simeq 30.5$.
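
The construction above can be sketched in Python as follows. Two points are assumptions rather
than the original recipe: the detection probability is modelled here as a logistic roll-off
around the limiting magnitude with a steepness of 3, applied to the V band only, because the
expression actually used is not reproduced in this text; and the quantity $\epsilon$ is
interpreted as the standard deviation of a Gaussian measurement error. The function names and
the seed are hypothetical, so the number of detected Cepheids need not match the value of about
80 quoted above.

```python
import math
import random

R_V, R_I = 3.3, 1.95          # A_V and A_I per unit E(B-V), as quoted in the text


def detection_probability(v_mag, v_lim=26.5, steepness=3.0):
    """ASSUMED completeness curve (logistic roll-off around v_lim); not from the source."""
    return 1.0 / (1.0 + math.exp(steepness * (v_mag - v_lim)))


def simulate_galaxy(mu_true, n_cepheids=250, seed=0):
    """Return the detected Cepheids of one simulated galaxy as (V, I, log P) tuples."""
    rng = random.Random(seed)
    err = 0.2 - (32.0 - mu_true) / 45.0          # error amplitude, taken as a Gaussian sigma
    detected = []
    for _ in range(n_cepheids):
        log_p = rng.gauss(1.0, 0.35)
        ebv = rng.gauss(0.08, 0.04)
        delta = rng.gauss(0.0, 0.3)              # intrinsic scatter, shared by V and I
        v = mu_true - 2.769 * log_p - 1.294 + delta + R_V * ebv + rng.gauss(0.0, err)
        i = mu_true - 3.041 * log_p - 1.726 + delta + R_I * ebv + rng.gauss(0.0, err)
        if rng.random() < detection_probability(v):   # V band only: a simplification
            detected.append((v, i, log_p))
    return detected


galaxy = simulate_galaxy(mu_true=30.5)
mean_log_p = sum(c[2] for c in galaxy) / len(galaxy)
print(len(galaxy), "Cepheids detected, <log P> =", round(mean_log_p, 2))
```

Because faint, short-period Cepheids are preferentially lost, the mean log P of the detected
sample is larger than the input mean of 1.0.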



5.2. Results 

We first simulate a galaxy similar to NGC 4536 by locating it at a true distance modulus
$\mu_{True} = 30.76$. Figures 5 and 6 show respectively the kind of absolute V and I PL relations we
obtain. This simulated galaxy has, for instance, 77 observable Cepheids and is comparable in
every respect to NGC 4536. We can reduce these data to compute its distance modulus ($\mu_0$)
according to the procedure described in section 2.
We find:

$$\mu_0 = 30.59 \tag{13}$$

and the bias is then:

$$\mu_0 - \mu_{True} = -0.17 \tag{14}$$

At the very least, when using HST data, authors involved in the HST Key Project make a cut at
log P = 1. When we choose such a lower limit in our simulation, the induced bias is -0.12
mag. Figure 7 shows the variation of the computed distance modulus as a function of $\log P_l$. The
behaviour is the same as the one observed for NGC 4536 (see fig. 4) and, as such, it confirms our
interpretation. We plot the plateau on figure 7, which is obtained with a cutoff at $\log P_l = 1.24$
(then $\langle \log P \rangle = 1.45$).

¹ Note that the color distribution $(V - I)_0 \simeq 0.272 \log P + 0.432$ is underlying.

If we compute the theoretical cutoff according to equation (9), we obtain $\log P_l = 1.23$, in close
agreement with the real value. This formula (9) may be very useful whenever the identification
of the plateau is difficult.

Moreover, in order to make sure that the result does not depend on this particular random draw,
we prepare a random set of 200 galaxies and we place them at increasing distance moduli.
We first test the variation of the slope of the PL relation: figure 8 shows the variation of the
slope as a function of $\mu_{True}$. This figure is directly comparable to figure 3 and fully supports our
interpretation.

Of greater significance is our evaluation of the bias on the mean distance moduli $\langle \mu_0 \rangle$ from the
same set of galaxies as a function of $\mu_{True}$ when the correct slope is used. One can see that the
bias is indeed far from negligible (fig. 9). It reaches:

$$\langle \mu_0 \rangle - \mu_{True} = -0.20 \pm 0.01 \quad (\sigma = 0.20) \tag{15}$$

for $\mu_{True} = 32.2$ and $V_{lim} = 26.5$.

We also test the efficiency of the bias correction by applying a log P cutoff. We use our model
(see eq. 9) and compute again $\langle \mu_0 \rangle - \mu_{True}$ (open dots in fig. 9). The bias
is strongly reduced in this manner, although it still exists. Clearly, the bias depends critically on
the limiting magnitude $V_{lim}$. Our simulation is made with a constant $V_{lim} = 26.5$ while, e.g., in the Key
Project conducted with the HST, distant galaxies have observations taken with longer exposure
times, leading to fainter $V_{lim}$ (for galaxies in the Virgo and Fornax clusters, the limiting magnitude
may be between 26.5 and 27 mag). This will induce different biases depending on the actual limiting
magnitude.
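
The effect of the period cutoff on the zero-point can be illustrated with a deliberately
simplified Monte Carlo. The sketch below is not the simulation described in section 5.1: it uses
the V band only, a sharp cut at the limiting magnitude, no extinction and no measurement error,
and it places the cutoff where the faint 2σ envelope of the PL relation meets the limiting
magnitude, evaluated at the true modulus. All of these are assumptions made for brevity, and the
function name `mean_bias`, the sample size and the seed are hypothetical.

```python
import random

SLOPE, ZERO_POINT, SCATTER = -2.769, -1.294, 0.3   # V-band PL relation from the text


def mean_bias(mu_true, v_lim=26.5, n=20000, apply_cutoff=False, seed=2):
    """Mean of (mu_V - mu_true) over the Cepheids surviving a sharp cut at v_lim."""
    rng = random.Random(seed)
    # Period limit where the faint 2-sigma envelope meets v_lim (assumed form).
    log_p_min = (v_lim - mu_true - ZERO_POINT - 2.0 * SCATTER) / SLOPE
    residuals = []
    for _ in range(n):
        log_p = rng.gauss(1.0, 0.35)
        v = mu_true + SLOPE * log_p + ZERO_POINT + rng.gauss(0.0, SCATTER)
        if v >= v_lim:
            continue                                  # too faint to be observed
        if apply_cutoff and log_p < log_p_min:
            continue                                  # rejected by the period cut
        mu_v = v - (SLOPE * log_p + ZERO_POINT)       # modulus with the correct slope
        residuals.append(mu_v - mu_true)
    return sum(residuals) / len(residuals)


bias_all = mean_bias(31.0)
bias_cut = mean_bias(31.0, apply_cutoff=True)
print(f"all Cepheids: {bias_all:+.3f} mag, with log P cutoff: {bias_cut:+.3f} mag")
```

Even with the correct slope the magnitude-limited sample gives a modulus that is too small, and
restricting the sample to long periods removes most of the offset, in line with the behaviour
described above.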

From a practical standpoint, one way of taking this bias into account might be to cut the studied
sample systematically in log P according to equation (9). For instance, the effects on the distance
moduli of NGC 4321 and NGC 2541 are respectively 0.09 and 0.05 magnitudes. Another approach,
more statistically oriented, could be to use the values given in figure 9 and to add the value of the bias
we compute to the distance modulus obtained from the complete sample.



6. Conclusion 

In conclusion, we draw attention to the difficulties involved in using the PL relation. The
slope cannot be determined without considering the population incompleteness bias. Furthermore,
it is most important to note that the use of the relation, even with the correct slope, should
at least be limited to long period Cepheids. The correction for the bias is difficult as long as
the completeness of the sample is not perfectly well established. We consider, however, that
this incompleteness bias should be systematically taken into account when deriving extragalactic
distances from the Cepheid PL relation, for instance according to equation (9), even though it
may be of little importance in some cases.

Fig. 1. — See next figure.

Fig. 2. — NGC 4536 V and I band absolute PL relations. The absolute magnitudes are simply 
$M_V = V - \langle \mu_V \rangle$ and $M_I = I - \langle \mu_I \rangle$, $\langle \mu_V \rangle$ and $\langle \mu_I \rangle$ being respectively the mean V and I
apparent distance modulus of the considered galaxy. The solid line corresponds to the adopted PL
relation (eqs. 1 and 2). The dotted lines give the corresponding ±2σ limits. The dashed line gives
the relation calculated from a direct regression. The horizontal line gives the approximate cutoff 
in absolute magnitude resulting from the instrumental limiting magnitude. The linear regression 
(dashed line) forced onto the distorted distribution tends to pass through point A, leading to too 
shallow a slope. 

Fig. 3. — Apparent slope a' of the PL relation as a function of the distance modulus. The trend 
is typical of a statistical bias. Open stars represent ground-based observations whereas filled stars 
correspond to HST observations. The solid curves give the predicted variation in both cases from 
our model (see eq. 8).

Fig. 4. — Variation of the mean distance modulus vs. $\log P_l$ for NGC 4536. The mean distance
modulus depends on the adopted cutoff in log P (see text).

Fig. 5. — See next Figure. 

Fig. 6. — Simulation of V and I band absolute PL relations. The simulated galaxy is located at 
$\mu_{True} = 30.76$. The solid line corresponds to the adopted PL relation (see eqs. 10 and 11). The
dotted lines give the corresponding ±2σ limits. The dashed line gives the relation calculated from
a direct regression. 

Fig. 7. — Variation of the mean distance modulus vs. $\log P_l$ for a typical simulated galaxy located
at $\mu_{True} = 30.76$. The mean distance modulus depends on the adopted cutoff in log P (see text).

Fig. 8. — Simulation of the bias affecting the slope of the PL relation as a function of the distance 
modulus. The plotted slope is the mean value of a set of 200 simulated galaxies having increasing 
distance moduli. The horizontal dotted line indicates the unbiased value. The curve represents our 
model given by equation (8).

Fig. 9. — Simulation of the error made when computing the distance modulus of a galaxy,
because of incompleteness. Filled dots represent the mean error $\mu_0 - \mu_{True}$ as a function of
$\mu_{True}$, for a set of 200 galaxies, when keeping all observed Cepheids in the calculation. Open dots
represent the same quantity when applying a cutoff in log P according to equation (9). One can
see that this procedure significantly reduces the error on the distance modulus.